Abstract. I outline a method for estimating astrophysical parameters (APs)
from multidimensional data. It is a supervised method based on matching
observed data (e.g. a spectrum) to a grid of pre-labelled templates. However, unlike
standard machine learning methods such as ANNs, SVMs or k-nn, this algorithm
explicitly uses domain information to better weight each data dimension in the
estimation. Specifically, it uses the sensitivity of each measured variable to each
AP to perform a local, iterative interpolation of the grid. It avoids both the
non-uniqueness problem of global regression as well as the grid resolution limitation
of nearest neighbours.



1. Introduction 

Consider the problem of estimating the astrophysical parameters (APs) of a star 
from its spectrum using a grid of pre-labelled spectra. Let $p = \{p_i\}$, $i = 1 \ldots I$, be
the data vector (spectrum or multiband photometry) and $\phi = \{\phi_j\}$, $j = 1 \ldots J$,
be the AP vector (e.g. $T_{\rm eff}$, $\log g$, etc.). Standard approaches involve performing
a global regression on the grid to infer the mapping $\phi = g(p)$, using, for example,
an artificial neural network (ANN) (e.g. Bailer-Jones et al. 1998) or a support
vector machine (SVM) (e.g. Tsalmantza et al. 2006).

Although these methods meet with reasonable success, they have problems
when it comes to estimating multiple APs, in particular if some APs have a
relatively weak signature (as is the case with $\log g$ and [Fe/H]: compare the
vertical scales in Fig. 1). To overcome this we should weight the variables
according to their sensitivity with respect to the APs of interest. In principle,
ANNs and SVMs implicitly learn this weighting from the data, but this is
difficult with many noisy variables. Furthermore, a global regression approach is
strictly flawed, because while the photon counts in a band vary uniquely with
the APs, the converse is not true (Fig. 1). The global regression is trying to
solve an inverse problem and the lack of uniqueness could lead to a poor fit.
This degeneracy problem is exacerbated at low spectral resolution and by noise.
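
The non-uniqueness of the inverse can be illustrated with a toy example. The sketch below assumes a hypothetical, non-monotonic forward function `toy_band` (it is not taken from the data used here): every AP value gives exactly one flux, but a single flux can correspond to more than one AP value.

```python
def toy_band(phi):
    """Hypothetical forward function: flux in one band as a function of one AP."""
    return (phi - 2.0) ** 2 + 1.0

def aps_matching(flux, grid, tol=1e-9):
    """Return all grid AP values whose predicted flux equals `flux`."""
    return [phi for phi in grid if abs(toy_band(phi) - flux) < tol]

grid = [0.0, 1.0, 2.0, 3.0, 4.0]
degenerate = aps_matching(2.0, grid)  # one flux, two different AP values
unique = aps_matching(1.0, grid)      # one flux, a single AP value
```

A global fit of AP against flux would have to pass a single curve through both degenerate solutions, which is the source of the poor fit described above.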

2. Basic idea 

The new algorithm addresses these issues by explicit use of the sensitivities,
$S_{ij}(\phi) = \partial p_i / \partial \phi_j$, of each band $i$ to each AP $j$. These are estimated by fitting a
(smooth) function, $p_i = f_i(\phi)$, to each band, which I refer to as the forward model
(Fig. 1). Sensitivities are estimated from these via first differences. These are
used to improve the commonly-employed nearest neighbour (or minimization)
technique. We locally interpolate the grid, defining "optimal" directions for
interpolation using the sensitivities, and predicting the photon counts at off-grid
points using the forward model.

Figure 1. Variation of photon counts with $\log(T_{\rm eff})$ (top) and $\log g$ (bottom)
in three filters (bands). A J-dimensional fit to each band (independently)
is a forward model. This is a true function (unlike a fit to the inverse).
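
As a minimal sketch of how sensitivities might be obtained from a forward model by first differences, the example below uses a hypothetical two-band, two-AP function `forward_model` in place of a real fit to the grid. The function, its coefficients, the step size `h` and the use of central differences are all assumptions made for illustration.

```python
def forward_model(phi):
    """Hypothetical smooth forward model: counts in I=2 bands for J=2 APs."""
    log_teff, logg = phi
    return [10.0 * log_teff + 0.5 * logg, 3.0 * log_teff ** 2 - 0.2 * logg]

def sensitivities(f, phi, h=1e-4):
    """Estimate S[i][j] = d p_i / d phi_j at phi by central first differences."""
    n_bands = len(f(phi))
    S = [[0.0] * len(phi) for _ in range(n_bands)]
    for j in range(len(phi)):
        up = list(phi)
        dn = list(phi)
        up[j] += h  # perturb only AP j, upwards
        dn[j] -= h  # and downwards
        p_up, p_dn = f(up), f(dn)
        for i in range(n_bands):
            S[i][j] = (p_up[i] - p_dn[i]) / (2.0 * h)
    return S

S = sensitivities(forward_model, [3.7, 4.5])
```

In this toy case both bands respond far more strongly to the first AP than to the second, which is the kind of weak signature that the sensitivities are meant to expose.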

3. The algorithm 

Task: estimate the APs of a measured vector $p_0$. The core algorithm (for $I = 1$, $J = 1$)
is as follows, where subscripts refer here to iterations (see Fig. 2):

1. Fit the forward model to the grid, $p = f(\phi)$

2. Initialize: find the nearest grid neighbour to $p_0$. Call this $(p_1, \phi_1)$

3. Use the forward model to calculate the local sensitivities, $\partial p / \partial \phi$

4. Calculate the discrepancy (residual), $\delta p_n = p_n - p_0$ (= $p_1 - p_0$ for the
first iteration)

5. Make a step in AP space, $\phi_{n+1} = \phi_n - \left(\frac{\partial \phi}{\partial p}\right) \delta p_n$. This is the new AP
prediction

6. Use the forward model to predict the corresponding (off-grid) flux, $p_{n+1}$

7. Iterate steps 3-6
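
The steps above can be sketched for the one-band, one-AP case as follows. This is an illustration under stated assumptions, not the actual implementation: `toy_forward` is a hypothetical monotonic function standing in for the fitted forward model of step 1, the discrepancy is taken between the predicted and the measured flux, the sensitivity comes from a central first difference, and a fixed number of iterations replaces any convergence test.

```python
def estimate_ap(p0, grid, f, n_iter=20, h=1e-4):
    """Iterate steps 2-7 for one band and one AP; returns the AP estimate."""
    # Step 2: nearest grid neighbour to the measurement p0 (in data space)
    p_n, phi_n = min(grid, key=lambda point: abs(point[0] - p0))
    for _ in range(n_iter):
        # Step 3: local sensitivity dp/dphi from the forward model
        s = (f(phi_n + h) - f(phi_n - h)) / (2.0 * h)
        # Step 4: discrepancy between predicted and measured flux
        dp = p_n - p0
        # Step 5: step in AP space using the reciprocal sensitivity
        phi_n = phi_n - dp / s
        # Step 6: forward model predicts the off-grid flux
        p_n = f(phi_n)
    return phi_n

def toy_forward(phi):
    """Hypothetical forward model (stands in for the fit of step 1)."""
    return phi ** 3 + 2.0 * phi

# Grid of (flux, AP) pairs; the measurement lies between grid points
grid = [(toy_forward(phi), phi) for phi in (0.0, 1.0, 2.0, 3.0)]
p_measured = toy_forward(1.3)
phi_hat = estimate_ap(p_measured, grid, toy_forward)
```

With a noise-free measurement and a monotonic forward model the estimate converges to the off-grid value 1.3, which a pure nearest-neighbour match (AP = 1.0) cannot reach.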

For the general case of $I > 1$, step 5 is simply an average (up to normalization) over the updates
calculated for each band, i.e. $\delta\phi = -\sum_i \frac{\partial \phi}{\partial p_i} \delta p_i$. For multiple APs ($J > 1$), we
can write this in matrix format as $\delta\phi = -R\,\delta p$, where $R = [R_{ji}]$ is the $J \times I$
matrix of reciprocal sensitivities, i.e. $R_{ji} = S_{ij}^{-1}$ (mathematical rigour being
sacrificed to some degree). The actual algorithm is a bit more complex (e.g.
modified step size).
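
A literal reading of the matrix update can be sketched as below, taking each element of R as the reciprocal of the corresponding sensitivity. The sensitivity values and residuals are invented for illustration. The sketch ignores the step-size modification mentioned above and would need extra care wherever a sensitivity is close to zero.

```python
def ap_update(S, dp):
    """Return dphi = -R dp, with R[j][i] = 1 / S[i][j] (element-wise reciprocal)."""
    n_bands, n_aps = len(S), len(S[0])
    dphi = []
    for j in range(n_aps):
        total = 0.0
        for i in range(n_bands):
            # Contribution of band i to the update of AP j
            total += dp[i] / S[i][j]
        dphi.append(-total)
    return dphi

# Hypothetical sensitivities for I=3 bands (rows) and J=2 APs (columns)
S = [[2.0, 4.0], [1.0, -5.0], [0.5, 10.0]]
dp = [0.2, -0.1, 0.05]  # hypothetical flux residuals in the three bands
dphi = ap_update(S, dp)
```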
4. Application and results

I apply the algorithm to a set of synthetic optical photometry of stars showing 
variation in $T_{\rm eff}$ and $\log g$. The $I = 11$ photometric bands are part of a system
originally designed for Gaia (Jordi et al. 2006). The data show a large variation
in APs (Fig. 4), and the grid is quite sparse, comprising just 233 objects. The
test set contains 234 objects, with no AP combinations in common between the 
grid and test set. The signal-to-noise ratio of the test set has been reduced to 
10 per band. 

The algorithm applied is actually a simpler version of the general case
described above: the forward model in steps 3 and 6 is a local linear interpolation
(i.e. a plane) of the neighbours in AP space (not data space!) to the current
point under consideration.^1 Thus the forward models are robust but suboptimal
(as we are no longer taking advantage of the reasonable assumption that
$p_i = f_i(\phi)$ is smooth).
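
A local plane of this kind can be sketched for one band and two APs as below. The choice of exactly three neighbours, their values, and the use of Cramer's rule are assumptions for illustration. The text does not say how many neighbours are used or how the plane is fitted.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def fit_plane(neighbours):
    """Fit p = a + b*phi1 + c*phi2 through three (phi1, phi2, p) neighbours."""
    A = [[1.0, n[0], n[1]] for n in neighbours]
    rhs = [n[2] for n in neighbours]
    d = det3(A)  # zero if the three neighbours are collinear in AP space
    coeffs = []
    for col in range(3):
        # Cramer's rule: replace one column of A by the right-hand side
        Ac = [row[:] for row in A]
        for r in range(3):
            Ac[r][col] = rhs[r]
        coeffs.append(det3(Ac) / d)
    return coeffs  # [a, b, c]; b and c act as the local sensitivities

# Hypothetical neighbours: (log Teff, log g, photon counts in one band)
neighbours = [(3.6, 4.0, 12.0), (3.8, 4.0, 14.0), (3.7, 5.0, 13.5)]
a, b, c = fit_plane(neighbours)
flux = a + b * 3.7 + c * 4.5  # predicted flux at an off-grid AP point
```

The slopes `b` and `c` play the role of the sensitivities in step 3, and evaluating the plane supplies the off-grid flux needed in step 6.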

The progress of the algorithm in terms of the AP estimates is demonstrated 
in Fig. [3] for two of the 234 test cases. These have been selected to demonstrate 
both good and poor convergence for the two APs. 

The residuals of the estimates over all 234 objects are shown in Fig. 4,
plotted against the true APs. The RMS errors are 1.65 dex in $\log g$ and 0.042 dex
in $\log(T_{\rm eff})$ (corresponding to 480 K at 5000 K). These compare to 0.99 dex for
$\log g$ and 0.028 dex for $\log(T_{\rm eff})$ (320 K at 5000 K) for an SVM model trained and
tested on the same data. The systematic error in $\log g$ is also seen with the SVM
and is characteristic of the weak AP problem (but not, I believe, unassailable).
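
The temperature errors quoted in kelvin are consistent with first-order error propagation from the logarithmic errors. The worked check below is a sketch under that assumption (the text does not state how the conversion was made):

```latex
\Delta T \simeq \ln(10)\, T\, \Delta(\log_{10} T)
\quad\Rightarrow\quad
\ln(10) \times 5000\,\mathrm{K} \times 0.042 \approx 484\,\mathrm{K},
\qquad
\ln(10) \times 5000\,\mathrm{K} \times 0.028 \approx 322\,\mathrm{K}
```

These round to the quoted 480 K and 320 K.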



^1 Neighbours are selected so as to "surround" the current point in AP space as well as possible,
providing a sort-of "bracketing" (a concept which is only properly defined in one dimension).
Figure 3. AP estimate vs. iteration number for two sources (left and
right), showing examples of correct and incorrect convergence. The horizontal
line shows the true parameters.
Figure 4. Parameter estimation residuals for $\log g$ (left) and $\log(T_{\rm eff})$ (right).



5. Conclusions 

The algorithm currently performs slightly worse than one of the best generic 
regression algorithms available (an SVM), yet this is not bad considering that 
(a) it is in an early stage of development, and (b) the results were obtained using 
a (suboptimal) linear forward model. Unlike ANNs, the dimensionality of the 
algorithm's fitting depends on the number of APs (J), not the number of data 
dimensions (I), so it should scale well to typical spectral problems (J is a few,
I is a few thousand). Moreover, the method has the ability to detect and report
multiple solutions which arise from degeneracies in the data (see Fig. 1).